Tumor genome wide DNA alterations assessed by array CGH in patients with poor and excellent survival following operation for colorectal cancer.

Genome-wide DNA alterations were evaluated by array CGH, in addition to RNA expression profiling, in colorectal cancer from patients with excellent and poor survival following primary operations. DNA was used for CGH on BAC and cDNA arrays. Global RNA expression was determined on 44K arrays. DNA and RNA from tumor and normal colon were obtained from cancer patients grouped according to death, survival or Dukes A, B, C and D tumor stage. Confirmed DNA alterations in all Dukes stages A-D were judged relevant for carcinogenesis, while changes in Dukes C and D only were regarded as relevant for tumor progression. Copy number gain was more common than loss in tumor tissue (p < 0.01). Major tumor DNA alterations occurred in chromosomes 8, 13, 18 and 20, where short survival included gain in 8q and loss in 8p. Copy number gains related to tumor progression were most common on chromosomes 7, 8, 19 and 20, while corresponding major losses appeared in chromosome 8. Losses at chromosome 18 occurred in all Dukes stages. Normal colon tissue from cancer patients displayed gains in chromosomes 19 and 20. Mathematical vector analysis implied a number of BAC clones in tumor DNA with genes of potential importance for death or survival. The genomic variation in colorectal cancer cells is tremendous and emphasizes that BAC array CGH is presently more powerful than available statistical models to discriminate DNA sequence information related to outcome. The present results suggest that a majority of DNA alterations observed in colorectal cancer are secondary to tumor progression. Therefore, it would require an immense amount of work to distinguish primary from secondary DNA alterations behind colorectal cancer.


Introduction
It is assumed that colorectal cancer development constitutes an evolutionary process with a stepwise accumulation of the genetic alterations required for increased malignancy (Fearon and Vogelstein, 1990). Around 15% of colorectal tumors are characterized by microsatellite instability (MSI or MIN) in combination with various mutations due to deficient DNA mismatch repair (MMR) genes (Kinzler and Vogelstein, 1996). The majority of malignant colorectal tumors are, however, characterized by chromosomal instability (CIN), which refers to the appearance of gross chromosomal aberrations including gain and loss of large DNA regions or even whole chromosomes (Lengauer et al. 1998; Rajagopalan et al. 2003). CIN leads to an increased inability to maintain genome integrity, although the precise order of genomic events is less well defined. In contrast to CIN tumors, MSI neoplasms typically retain a near-diploid karyotype and show near-normal frequencies of gross chromosomal aberrations (Bhattacharyya et al. 1994; Parsons et al. 1993; Eshleman et al. 1998). However, aneuploid changes typical of CIN tumors may occur early in low-grade dysplastic adenomas and are therefore proposed as major factors behind progression of colon cancer (Hermsen et al. 2002), although recent observations have questioned whether genetic instability precedes tumor formation (Cardoso et al. 2007). The development of advanced techniques such as high-resolution microarrays (Pinkel et al. 1998; Pollack et al. 1999; Snijders et al. 2001; Ishkanian et al. 2004) provides possibilities for detailed genome-wide screening of DNA copy number changes in malignant tumors as well as epigenetic alterations (Pinkel and Albertson, 2005; Cardoso et al. 2007). Taken together, emerging results reveal an unexpected magnitude and complexity of genetic damage in both coding and non-coding regions in various stages of colorectal cancer (Douglas et al. 2004; Nakao et al. 2004; Buffart et al. 2005; Mehta et al. 2005; Jones et al. 2005; Camps et al. 2006). In the present study, we describe quantitative DNA alterations by array CGH analysis in macrodissected colorectal cancer tissue as related to disease stage and survival following primary operations aimed at cure. Our results add to published information, particularly on the difference in DNA alterations between tumors from patients with early relapse and death and tumors from cured patients.

Patient groups
The patient material comprised 64 patients operated on for sporadic primary colorectal carcinoma. Thirty-two patients who underwent primary surgery in Uppsala county, Sweden, between 1988 and 1990 were subdivided into two groups according to survival. Nineteen patients alive 200 months after primary surgery were grouped as "alive." Thirteen patients who died of colorectal cancer within 12 months after their primary operation were grouped as "dead." Alive patients comprised 6 males and 13 females classified as 4 Dukes A, 11 Dukes B and 4 Dukes C; 21% had MSI-positive tumors and 53% had tumors with p53 mutations. Dead patients comprised 7 males and 6 females classified as 3 Dukes B, 3 Dukes C and 7 Dukes D; 31% had MSI-positive tumors and 62% had tumors with p53 mutations, as described elsewhere (Lagerstedt et al. 2005).
An additional 32 patients were included following primary operations in Uddevalla County, Sweden, between 2001 and 2003 and were grouped according to tumor stage by the Dukes A-D classification. Each category of Dukes A, B, C and D comprised 8 patients with 4 males and 4 females, except the Dukes D group, which contained 5 males and 3 females. None of the 64 patients underwent any additional treatment besides surgery, according to our institutional standard procedures at the time of operation.

BAC array construction and procedures
Microarrays with complete genome coverage were produced from the 32K BAC clone library (CHORI BACPAC Resources, http://bacpac.chori.org/genomicRearrays.php) by the Swegene DNA Microarray Resource Center, Department of Oncology, Lund University, Sweden (http://swegene.onk.lu.se). DOP-PCR products were obtained from BAC DNA template, purified using filter-based 96-well plates (PALL), dried and re-suspended in 50% DMSO. Arrays were printed on UltraGAPS slides (Corning) using a MicroGrid II spotter (Biorobotics) as described in detail elsewhere (Jonsson et al. 2005a). Six 32K tiling BAC arrays were used to determine DNA copy number alterations in pooled tumor DNA from patients grouped as dead, alive, Dukes A, B, C and D in comparison to reference DNA (Human Genomic DNA from whole blood, Clontech, BD Biosciences). An array was also run on tumor DNA from dead patients versus tumor DNA from alive patients. Normal colon tissue DNA from dead and alive patients was also hybridized against reference DNA. cDNA array analyses of DNA were also used for comparison with observations from BAC array analyses (Fig. 4).
Overall chromosomal aberrations were given as the number of BAC clones considered altered (gain or loss of copy number) divided by the total number of clones in the genome wide evaluation where X and Y-chromosomes were excluded.
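The ratio defined above can be illustrated with a minimal sketch. It assumes that each clone is represented by a (chromosome, log(2) ratio) pair and that a clone counts as altered when its ratio lies outside the ±0.2 limits given in the data processing section; the function name and the example values are hypothetical and are not part of the BASE workflow used in the study.

```python
def altered_fraction(clones, gain_limit=0.2, loss_limit=-0.2):
    """Fraction of autosomal clones called as gain or loss.

    clones: iterable of (chromosome, log2_ratio) tuples.
    Clones on the X and Y chromosomes are excluded, as in the text.
    """
    autosomal = [(c, r) for c, r in clones if str(c).upper() not in {"X", "Y"}]
    if not autosomal:
        return 0.0
    altered = sum(1 for _, r in autosomal if r > gain_limit or r < loss_limit)
    return altered / len(autosomal)


# Hypothetical example values, for illustration only.
example = [("8", 0.45), ("8", -0.31), ("1", 0.02), ("2", -0.05), ("X", 0.9)]
print(altered_fraction(example))  # 2 of the 4 autosomal clones are altered
```

The same function can be applied per chromosome by passing only the clones that map to that chromosome.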
DNA was extracted from fresh frozen primary colorectal carcinomas and normal colon tissue (down to the serosa layer) with the QIAamp DNA Mini Kit (Qiagen) according to the manufacturer's instructions. All tumors contained around 60-80% neoplastic cells according to separate estimates, with the remaining 20-40% consisting of endothelial, stromal and inflammatory cells. Sample labeling and hybridization were performed as described (Jonsson et al. 2005a). Briefly, 1.5-3 μg of genomic DNA from patients and reference DNA was differentially labeled with Cy5-dCTP or Cy3-dCTP (Amersham Biosciences) using random primer labeling (Bioprime array CGH genomic Labeling module, Invitrogen). Labeled sample and reference DNA were mixed, and unincorporated nucleotides were removed using the CyScribe GFX purification kit (Amersham Biosciences) prior to coprecipitation with human Cot-1 DNA. The labeling reactions were applied to arrays and incubated for 72 h at 37 °C. Slides were washed and scanned in an Agilent microarray scanner (Agilent Technologies). Identification of individual spots on scanned arrays was performed with GenePix Pro 4.0 (Axon Instruments).

cDNA array construction and procedures
cDNA microarrays containing 27,648 sequence-verified IMAGE clones from the Research Genetics IMAGE clone library were obtained from the Swegene DNA Microarray Resource Center at Lund University (http://swegene.onk.lu.se). A total of 6 μg of sample and reference DNA was labeled and hybridized according to the procedures described above for BAC arrays, except that cDNA arrays were hybridized at 42 °C.

RNA extraction and microarray expression
Tumor and normal colon tissue RNA was extracted either with TRIzol reagent (Invitrogen Life Technologies), followed by linear mRNA amplification with the BD SMART mRNA amplification kit (BD Biosciences, Clontech, Palo Alto, CA, U.S.A.), or with the RNeasy Fibrous Tissue Kit (Qiagen), followed by mRNA selection with the mRNA Purification Kit (Amersham Biosciences). RNA fractions were quality controlled in a Bioanalyzer (Agilent Technologies) and quantified with a NanoDrop ND-1000A Spectrophotometer (NanoDrop Technologies Inc). A total of 400 ng polyA+ mRNA from tumor and normal colon was labeled with Cy3-dCTP and Cy5-dCTP, respectively (Amersham Biosciences), with the Agilent Fluorescent Direct Label Kit, and samples were hybridized to 44K Human Whole Genome Oligo Microarrays (Agilent Technologies) using the In situ Hybridization Kit Plus (Agilent Technologies), incubated at 60 °C for 18 hours and scanned on an Agilent Microarray scanner. Three patients were hybridized individually (with technical replicates and dye-swaps) and six patients were pooled and run as a single experiment. Data were processed in Feature Extraction Software, v.7.5 (FE) (Agilent Technologies): background was subtracted, outliers were flagged and dyes were normalized with the linear and lowess methods. Processed signals from FE output files were imported into GeneSpring Software, v.7.2 (Silicon Genetics, Agilent Technologies) with the Agilent Feature Extraction plug-in. Dye-transformation of specified samples, normalization per spot (divided by control channel) and per chip (normalized to the 50th percentile), and filtering on flags were performed. Processed data from the three individual patients and the pool of six patients were combined, and the 99% confidence interval was calculated from the merged data to identify genes with aberrant expression. Patient data represent gene expression in tumors from Dukes A (1), Dukes B (2), Dukes C (4) and Dukes D (2) from five females and four males.
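The confidence-interval step can be sketched as follows. This is only an illustration under stated assumptions: the expression values are taken to be log ratios, the 99% limits are approximated as mean ± 2.576 SD of the merged data, and the function name and example values are hypothetical. The exact calculation performed in GeneSpring is not described in the text and may differ.

```python
from statistics import mean, stdev


def flag_aberrant(log_ratios, z=2.576):
    """Return indices of values outside mean ± z * SD.

    z = 2.576 corresponds to a two-sided 99% interval for
    normally distributed data (an assumption of this sketch).
    """
    centre = mean(log_ratios)
    spread = stdev(log_ratios)
    low, high = centre - z * spread, centre + z * spread
    return [i for i, v in enumerate(log_ratios) if v < low or v > high]


# Hypothetical merged log ratios: 20 unremarkable genes and one outlier.
merged = [0.1, -0.1] * 10 + [3.0]
print(flag_aberrant(merged))
```

Genes whose indices are returned would be the candidates for aberrant expression in this simplified picture.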

DNA image analysis, data processing and statistics
Images were acquired on an Agilent G2565AA microarray scanner (Agilent Technologies, Palo Alto, CA). Fluorescence intensities were extracted using the GenePix Pro 4.0 software (Axon Instruments Inc, Foster City, CA) and uploaded into the BioArray Software Environment (BASE) open source software (http://base.thep.lu.se) for further analysis (Saal et al. 2002). Data analysis was performed in BASE as described (Jonsson et al. 2005b). Briefly, intensity ratios for each spot were obtained by calculating background-corrected Cy3 and Cy5 intensities from the median and local background pixels. Spots with Cy3 and Cy5 intensities >65000 and a signal to noise ratio <1.5 and a spot radius >40 were excluded from the data set in BAC analyses, while cDNA ratios in spots were handled similarly without any restriction in signal intensities. Spots flagged by the GenePix software were removed prior to normalization by the Lowess curve fit method for both platforms (Yang et al. 2002). A moving average of three clones was applied and the BASE implementation of CGH Plotter was used to determine deletion/amplicon boundaries (Autio et al. 2003). The noise constant was set to 15 and amplification/deletion limits were set to log(2) values of ±0.2. High reproducibility of log(2) values was obtained for all BAC clones within the 32K array, with a mean SD of 0.135 in self versus self hybridizations. Further, analysis of cells with different numbers of X chromosomes demonstrated a linear rise in log(2) values for X clones (unpublished). Mapping information was retrieved from the UCSC Genome Browser (March 2005 freeze). The uniformity of the log(2) ratio distribution in chromosomes as well as in complete data sets was tested and confirmed by the Kolmogorov-Smirnov test. Only autosomal clones were included in the analysis. The SD calculated from log(2) ratios from all samples was 0.14.
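The smoothing and thresholding described above can be illustrated with a short sketch. It is a simplified stand-in and not the CGH Plotter algorithm, which also determines segment boundaries; it only applies a centered moving average of three clones and then calls each clone using the ±0.2 log(2) limits. The function names and the example ratios are hypothetical.

```python
def moving_average(values, window=3):
    """Centered moving average; edge clones use the neighbours available."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def call_clones(log2_ratios, limit=0.2, window=3):
    """Label each clone as gain, loss or normal after smoothing."""
    calls = []
    for value in moving_average(log2_ratios, window):
        if value > limit:
            calls.append("gain")
        elif value < -limit:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls


# Hypothetical log(2) ratios for clones ordered along one chromosome.
ratios = [0.0, 0.1, 0.6, 0.5, 0.4, 0.0, -0.5, -0.6, -0.4]
print(call_clones(ratios))
```

Smoothing before thresholding means that a single noisy clone is less likely to be called on its own, at the cost of blurring the exact boundaries of an altered region.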
Differences between samples were analyzed with the χ² test and corrected by Bonferroni adjustment.
Vector analysis was performed on data from hybridization of tumor DNA from dead and alive patients vs reference DNA from normal subjects. Net alterations in hybridization log(2) ratios were graphed in a two-dimensional coordinate system, where the different quadrants correspond to conditions or events that promote death or survival, directly or indirectly related to genetic deviations compared to normal reference DNA.
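As an illustration of this kind of comparison, the sketch below computes a Pearson χ² statistic for a 2 × 2 table of counts (for example, gained versus lost clones in two sample groups) and applies a Bonferroni adjustment to a list of p values. It assumes no continuity correction and one degree of freedom; the counts and function names are hypothetical, and the software actually used for the published statistics is not stated in the text.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and
    no continuity correction.
    """
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / math.prod(margins)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts: gains and losses in two groups of clones.
stat, p = chi2_2x2(30, 10, 10, 30)
print(stat, p)
print(bonferroni([0.01, 0.04, 0.5]))
```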
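One way to picture such a ranking is sketched below. The exact vector construction used in the study is not specified in the text, so this sketch makes a loud assumption: each clone is a point with the alive/reference log(2) ratio on one axis and the dead/reference log(2) ratio on the other, and clones are ranked by the absolute difference between the two. The clone names, values and function name are hypothetical.

```python
def rank_death_related(clones, top_n=20):
    """Rank clones by |dead - alive| log2 ratio (assumed scoring).

    clones: list of (name, log2_alive_vs_ref, log2_dead_vs_ref).
    Returns up to top_n tuples of (name, direction, score), where
    direction says whether dead tumors show relative gain or loss.
    """
    scored = []
    for name, alive, dead in clones:
        diff = dead - alive
        if diff > 0:
            direction = "gain"
        elif diff < 0:
            direction = "loss"
        else:
            direction = "none"
        scored.append((abs(diff), name, direction))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, direction, score) for score, name, direction in scored[:top_n]]


# Hypothetical clones, for illustration only.
points = [("clone_A", 0.0, 0.6), ("clone_B", 0.3, 0.3), ("clone_C", 0.1, -0.7)]
for name, direction, score in rank_death_related(points, top_n=2):
    print(name, direction, round(score, 2))
```

Under this assumed scoring, a clone altered equally in both groups (such as clone_B) scores zero, which matches the idea that shared alterations do not separate death from survival.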

Genome wide alterations in tumor tissue vs normal colon tissue
The proportion of aberrant clones ranged from 1% to 15% genome wide and reached up to 82% for individual chromosomes in tumor DNA (Table 1). Copy number gains were significantly more common than loss of DNA sequences (p < 0.01). Structural DNA alterations in tumor tissue versus normal DNA were found in each chromosome. The chromosomes with the highest prevalence of altered BAC clones were 8, 13, 18 and 20, and the least altered chromosomes were 1, 2, 3, 5, 6 and 11. The size of copy number losses ranged from 4 to 351 BAC clones, corresponding to 210 kbp to 36 Mbp. The extent of gains and amplifications ranged from 2 to 599 BAC clones, corresponding to 27 kbp to 55 Mbp. No homozygous deletions were observed.
RNA expression profiles in tumor tissue from colorectal cancer patients of the same cohort displayed 78 genes with significantly increased expression and 140 genes with decreased expression in tumor tissue vs normal colon tissue. Figure 1D shows the spectrum of expression along the genome compared to observed structural DNA alterations (Figs. 1A-C).

Genome wide DNA alterations in tumor tissue from dead and alive patients
Four percent, 8% and 2% of the BAC clones of autosomal chromosomes were altered in tumor DNA analyzed in the different sets of hybridizations (dead/alive, dead/reference, alive/reference) (Table 1). Copy number gain was more common than loss (p < 0.01), and dead patients had a higher frequency of genome-wide gain and loss in tumor DNA than alive patients (p < 0.01). Several chromosomes showed major DNA alterations in tumor tissue, namely chromosomes 8, 13, 18 and 20 (Table 1, Fig. 2).

DNA alterations related to tumor progression (Dukes A + B, C + D)
Seven percent, 4%, 15% and 12% of BAC clones representing autosomal chromosomes were altered in Dukes A, B, C and D tumors, respectively (Table 1). Copy number gain was significantly more common than copy number loss in all Dukes stages (p < 0.01). Gains were found in chromosomes 7 (9%-45%), 8 (1%-55%) and 20 (46%-76%) in Dukes A-D.

Large-scale copy number variation in normal colon tissue DNA
Confirmed and unconfirmed large-scale copy number variation was observed in normal colon tissue from cancer patients with different clinical outcome (Iafrate et al. 2004; Sebat et al. 2004; Eichler, 2006). These changes are summarized in Table 3. Alive patients displayed only confirmed CNV loci, while both confirmed and unconfirmed DNA alterations occurred in our dead patients. Such unconfirmed DNA loci were evaluated for candidate genes with importance for tumor progression according to proliferation or apoptosis (Table 6).

Figure 5 demonstrates distributions of DNA alterations between dead and alive patients in whole genome hybridizations versus normal reference DNA. Each observation indicates its proportional weight in vectors moving either towards death or survival. According to this plot, we ranked the 20 most extreme BAC clones contributing to death events due to copy number gain or loss. Genes in these DNA regions represent candidates related to disease-specific mortality, as presented in Table 7.

Discussion
Figure 1. [...] tumor DNA from alive patients vs normal reference DNA (C). Relative chromosomal copy number is given on the y-axis as the log(2) ratio. Each ratio represents a BAC clone on the array. Log(2) ratios above 0.2 were regarded as gain of copy number and log(2) ratios below −0.2 as loss of copy number. Alive patients were cured of colorectal cancer with more than 10 years of survival, while dead patients did not survive beyond 1 year following their primary operation. a) is the ±0.2 log(2) ratio (∼95% confidence limit) determined by CGH Plotter analysis software. Panel D shows RNA expression in tumor tissue vs normal colon tissue RNA from a comparable group of 9 cancer patients (Dukes A-D) selected by chance from the main patient cohort. b) represents ±2.6 SD (99% confidence interval).

Table 6. Unconfirmed CNP loci and corresponding genes with known function in DNA from normal colon tissue obtained from dead patients, of potential importance for interactions to predict death events.

Copy number change | Cytoband | Gene name | Protein function | BAC clone
Gain | 18q11.2 | GATA-6 | Transcription factor that may be important for regulating terminal differentiation and/or proliferation | RP11-121I20-RP11-219C07
 | | CTAGE1 | Cutaneous T-cell lymphoma-associated antigen |
 | | MCM8 | May have a role in control of cell proliferation |
 | 20p11.23 | RBBP9 | May play a role in the regulation of cell proliferation and differentiation |

The present study evaluates structural (sequence) alterations in DNA isolated from tumor tissue obtained at primary curative resections of [...] Bardi et al. 2004; Diep et al. 2004; Chang et al. 2006; Diep et al. 2006). The traditional approach in such efforts is to investigate a number of patients large enough to give statistical power to relate genetic alterations to survival and treatment response. This approach, with genome-wide analyses on material from individual patients in large cohorts, is restricted by financial costs and by statistical aspects of microarray analyses. Therefore, we chose an alternative approach with analyses on pooled DNA prepared from individual patients grouped according to clinical outcome or tumor stage (Dukes A-D), which represents a more robust model with less chance variation considering the large number of clones (∼32000) in each assay. Thus, a model based on pooled DNA and RNA provides more stable information by canceling out random variations, as emphasized by Cardoso et al. (Cardoso et al. 2007). Patients with either poor or excellent survival following surgery were selected from a large cohort of patients with colorectal cancer selected by chance and subjected to standard treatment at our institution. From a group of all patients operated on during 1990 and 2002 we randomly selected 13 patients who died of colorectal cancer within 12 months vs 19 patients who survived for more than ten years, which is statistically equivalent to being cured. A limitation of analyses on pooled DNA is that small but significant structural alterations may remain unidentified, which decreases the sensitivity of the analyses. However, as a screening procedure for evaluation of major factors, our approach is statistically superior.
In order to decrease the risk of misinterpretation of results from dead vs alive patients, we also confirmed such results by hybridization of tumor DNA vs reference leucocyte DNA commercially available from healthy individuals. Given the existence of copy number variants of relatively high frequency in the general population (Redon et al. 2006), it may not be beneficial to analyze a defined "normal reference DNA." However, this comparison is regarded as an internal analytical standard that is commercially available worldwide. DNA alterations detected in the surgically removed tumors represent the sum of changes accumulated during disease progression (Rajagopalan et al. 2003; Michor et al. 2005). It is possible that certain alterations are critical for carcinogenesis while others may promote invasive growth and metastatic spread (Buffart et al. 2005; Mehta et al. 2005; Ghadimi et al. 2003; Saha et al. 2001). Considerable efforts have been devoted to delineating differences between early and late events in colorectal cancer development (Lengauer et al. 1998; Lengauer et al. 1997). Theoretically, it may well be that critical genetic events during carcinogenesis are less important for tumor progression and vice versa (Hunter, 2004). Here, we approached this concept by comparing DNA alterations in patients with tumors of well-defined clinical stage according to Dukes. Patients with Dukes A and B tumors have, worldwide, a clearly better outcome than patients with Dukes C and D tumors. Therefore, when a defined DNA alteration occurs in all Dukes stages A-D and is not present in normal tissue, it should be related to carcinogenesis and early progression. On the other hand, when alterations appear in Dukes C and D tumors only, they should be associated with tumor progression.
In general, our study reveals that DNA copy number gains are more frequent than losses in colorectal cancer. Based on the above principles we observed a number of alterations that distinguish tumors with excellent versus poor prognosis, the most obvious being the alterations on chromosome 8. [...] were observed in Dukes C + D but not in Dukes A + B. Unexpectedly, normal colon tissue harbored quantitative DNA alterations (gains at chromosomes 19 and 20) that were also found in Dukes C + D tumors, which contradicts their connection to tumor progression. These DNA alterations may reflect the toxic environment that colon epithelium is exposed to during a lifetime, predisposing to carcinogenesis, but they may also represent CNVs among different subject populations. Several studies have implied critical DNA alterations that predict clinical outcome. Many such reports are based on less complex experimental models such as cultured tumor cells, where signal transduction pathways in control of cell proliferation and apoptosis are well described. However, the overall genomic aberrations observed in the present material make it a major challenge to distinguish primary from secondary DNA alterations. The regions defined here (e.g. on chromosome 8) include several hundred altered genes that may co-vary with other disease-specific alterations without having any primary cause-effect relationship to either carcinogenesis or progression. One approach to this problem may be to compare structural DNA alterations with significantly altered RNA expression along the genome, which provides information on DNA alterations in expressed genes (Pollack et al. 2002). Accordingly, Pollack et al. estimated that approximately 12% of variations in gene expression in breast cancer could be attributed to underlying copy number changes. A corresponding rough estimate for the present material may be around 5%, considering significantly altered expression versus copy number changes in tumors.
Definite information on altered expression versus copy number changes must await analyses of RNA and DNA from the same tissue specimen (Cardoso et al. 2007), which are under way in our laboratory. Therefore, we used two-dimensional vector analysis to sort out the 20 most extreme alterations related to poor survival and found that a majority of these genomic regions (represented by BAC clones) contained only a few known genes that may be related to cancer progression. This dilemma would require more thorough comparisons with gene expression and functional studies, and it is not resolved simply by applying available bioinformatic models to genomic data. Obviously, there is no simple solution for ranking positive and negative factors in prediction of clinical outcome, since it would demand genome-wide analyses on several tens of thousands of patients to resolve such predictions by classic statistics. The situation appears even more problematic considering redundant metabolic pathways that can overcome established defects in the control of gene expression, including epigenetic changes and micro RNAs (Feinberg et al. 2006; Michael et al. 2003). In this perspective, it presently appears impossible to resolve these questions with available models.
In conclusion, the results of the present study demonstrate that tiling array CGH is a powerful approach for genome-wide identification of DNA copy number alterations in pooled DNA from cancer patients. We used pools of tumors from clinically and/or pathologically well-defined patient subgroups, selected randomly, to sort out only major genomic patterns related to carcinogenesis, tumor progression and prognosis. Despite this approach, our results demonstrate an enormous number of DNA sequences that may explain carcinogenesis, tumor growth progression and disease-specific mortality. A next step should be to distinguish primary DNA events from secondary covariates to explain disease progression, although this presently seems an overwhelming task.